add QTE support for covariate adaptive randomization by okiner-3 · Pull Request #107 · CyberAgentAILab/python-dte-adjustment

okiner-3 · 2026-04-28T15:51:01Z

Implement predict_qte in SimpleStratifiedDistributionEstimator and AdjustedStratifiedDistributionEstimator with stratified bootstrap (resampling within each stratum independently) to correctly estimate variance under CAR designs.

close #64


Copilot

Pull request overview

Adds CAR-compatible QTE inference by overriding predict_qte in the stratified estimators to use stratified bootstrap (resampling independently within strata) for variance estimation.

Changes:

Implement predict_qte for SimpleStratifiedDistributionEstimator using stratified bootstrap.
Implement predict_qte for AdjustedStratifiedDistributionEstimator using stratified bootstrap.
Add required imports (Optional, norm) to support the new QTE CI computation.


Copilot · 2026-04-29T07:55:22Z

+            # Stratified bootstrap: resample within each stratum independently
+            bootstrap_indexes = np.concatenate([
+                np.random.choice(idx, size=len(idx), replace=True)
+                for idx in strata_indices.values()
+            ])
+
+            qtes[b] = self._compute_qtes(
+                target_treatment_arm,
+                control_treatment_arm,
+                quantiles,
+                self.covariates[bootstrap_indexes],
+                self.treatment_arms[bootstrap_indexes],
+                self.outcomes[bootstrap_indexes],
+                self.strata[bootstrap_indexes],
+            )


This PR changes QTE variance estimation under CAR by using stratified bootstrap, but the existing unit tests only assert shapes and basic ordering. Please add a test that would fail if bootstrapping were not stratified (e.g., assert each bootstrap replicate preserves per-stratum sample counts, or compare variance vs an unstratified bootstrap on an imbalanced-strata synthetic dataset).

Copilot · 2026-04-29T07:55:22Z

+        quantiles: Optional[np.ndarray] = None,
+        alpha: float = 0.05,
+        n_bootstrap=500,
+        display_progress: bool = True,
+    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
+        """
+        Compute Quantile Treatment Effects (QTE) using stratified bootstrap.
+
+        Uses stratified bootstrap (resampling independently within each stratum) to
+        correctly estimate variance under covariate adaptive randomization (CAR).
+
+        Args:
+            target_treatment_arm (int): The index of the treatment arm of the treatment group.
+            control_treatment_arm (int): The index of the treatment arm of the control group.
+            quantiles (np.ndarray, optional): Quantiles used for QTE. Defaults to [0.1, 0.2, ..., 0.9].
+            alpha (float, optional): Significance level of the confidence bound. Defaults to 0.05.
+            n_bootstrap (int, optional): Number of bootstrap samples. Defaults to 500.
+            display_progress (bool, optional): Whether to display a progress bar. Defaults to True.
+
+        Returns:
+            Tuple[np.ndarray, np.ndarray, np.ndarray]: A tuple containing:
+                - Expected QTEs (np.ndarray): Treatment effect estimates at each quantile
+                - Lower bounds (np.ndarray): Lower confidence interval bounds
+                - Upper bounds (np.ndarray): Upper confidence interval bounds
+        """
+        qte = self._compute_qtes(
+            target_treatment_arm,
+            control_treatment_arm,
+            quantiles,
+            self.covariates,
+            self.treatment_arms,
+            self.outcomes,
+            self.strata,
+        )


quantiles is optional in the signature and docs but is passed directly into _compute_qtes, which expects an ndarray and will break on None. Please initialize the default quantile grid when quantiles is None (and validate range/order).

Copilot · 2026-04-29T07:55:23Z

+        qte = self._compute_qtes(
+            target_treatment_arm,
+            control_treatment_arm,
+            quantiles,
+            self.covariates,
+            self.treatment_arms,
+            self.outcomes,
+            self.strata,
+        )


AdjustedStratifiedDistributionEstimator._compute_cumulative_distribution draws a fresh random fold assignment on each call (folds = np.random.randint(...)). Because predict_qte calls _compute_qtes many times inside the bootstrap, the resulting CI will include extra Monte Carlo noise from re-randomizing folds, not just resampling variability. Consider fixing folds once (e.g., store them at fit time or accept a random_state and reuse a RNG/seed) so bootstrap variance reflects sampling uncertainty only.
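One way to act on this suggestion is sketched below: draw the fold assignment once from a seeded generator, then index it with the bootstrap indices so each resampled observation keeps its original fold. This is only an illustration under stated assumptions. `make_fold_assignment`, `n_folds`, and `random_state` are hypothetical names, not part of the library's API, and the PR does not say which fix (if any) was adopted.

```python
import numpy as np


def make_fold_assignment(n_obs: int, n_folds: int, random_state: int) -> np.ndarray:
    """Hypothetical helper: assign each observation to a fold exactly once."""
    rng = np.random.default_rng(random_state)
    return rng.integers(0, n_folds, size=n_obs)


# Draw the folds a single time (for example at fit time) and keep them.
folds = make_fold_assignment(n_obs=12, n_folds=3, random_state=0)

# Inside the bootstrap loop, reuse the stored folds instead of re-drawing them:
# a resampled observation carries the fold it was originally assigned to.
rng = np.random.default_rng(1)
bootstrap_indexes = rng.choice(12, size=12, replace=True)
bootstrap_folds = folds[bootstrap_indexes]
```

With the folds held fixed, differences between replicates come from the resampled indices only, which is the property the comment asks for.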

Copilot · 2026-04-29T07:55:23Z

+        quantiles: Optional[np.ndarray] = None,
+        alpha: float = 0.05,
+        n_bootstrap=500,
+        display_progress: bool = True,
+    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
+        """
+        Compute Quantile Treatment Effects (QTE) using stratified bootstrap.
+
+        Uses stratified bootstrap (resampling independently within each stratum) to
+        correctly estimate variance under covariate adaptive randomization (CAR).
+
+        Args:
+            target_treatment_arm (int): The index of the treatment arm of the treatment group.
+            control_treatment_arm (int): The index of the treatment arm of the control group.
+            quantiles (np.ndarray, optional): Quantiles used for QTE. Defaults to [0.1, 0.2, ..., 0.9].
+            alpha (float, optional): Significance level of the confidence bound. Defaults to 0.05.
+            n_bootstrap (int, optional): Number of bootstrap samples. Defaults to 500.
+            display_progress (bool, optional): Whether to display a progress bar. Defaults to True.
+
+        Returns:
+            Tuple[np.ndarray, np.ndarray, np.ndarray]: A tuple containing:
+                - Expected QTEs (np.ndarray): Treatment effect estimates at each quantile
+                - Lower bounds (np.ndarray): Lower confidence interval bounds
+                - Upper bounds (np.ndarray): Upper confidence interval bounds
+        """
+        qte = self._compute_qtes(
+            target_treatment_arm,
+            control_treatment_arm,
+            quantiles,
+            self.covariates,
+            self.treatment_arms,
+            self.outcomes,
+            self.strata,
+        )


quantiles is documented as optional with a default ([0.1, …, 0.9]) but it’s passed straight into _compute_qtes. If the caller leaves quantiles=None, _compute_qtes will error when accessing quantiles.shape. Please set a default array when quantiles is None (and ideally validate they’re in (0,1)).
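A minimal sketch of the requested guard, assuming a standalone helper called at the top of predict_qte. The name `resolve_quantiles` is hypothetical; the default grid [0.1, ..., 0.9] and the open interval (0, 1) come from the docstring and this comment.

```python
from typing import Optional

import numpy as np


def resolve_quantiles(quantiles: Optional[np.ndarray] = None) -> np.ndarray:
    """Hypothetical helper: materialize the documented default and validate input."""
    if quantiles is None:
        # Documented default grid: 0.1, 0.2, ..., 0.9
        return np.linspace(0.1, 0.9, 9)
    quantiles = np.asarray(quantiles, dtype=float)
    if quantiles.ndim != 1 or quantiles.size == 0:
        raise ValueError("quantiles must be a non-empty 1-D array")
    # Written as "not all(inside)" so that NaN values are rejected as well.
    if not np.all((quantiles > 0.0) & (quantiles < 1.0)):
        raise ValueError("quantiles must lie strictly between 0 and 1")
    return quantiles
```

Calling `resolve_quantiles()` returns an array of shape (9,), so the later `quantiles.shape` access no longer sees None.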

Copilot · 2026-04-29T07:55:23Z

+        # Precompute stratum indices for stratified bootstrap
+        unique_strata = np.unique(self.strata)
+        strata_indices = {s: np.where(self.strata == s)[0] for s in unique_strata}
+
+        qtes = np.zeros((n_bootstrap, qte.shape[0]))
+        bootstrap_iter = range(n_bootstrap)
+        if display_progress:
+            bootstrap_iter = tqdm(bootstrap_iter, desc="Bootstrap QTE")
+        for b in bootstrap_iter:
+            # Stratified bootstrap: resample within each stratum independently
+            bootstrap_indexes = np.concatenate([
+                np.random.choice(idx, size=len(idx), replace=True)
+                for idx in strata_indices.values()
+            ])


The stratified-bootstrap implementation here is duplicated verbatim in both stratified estimator classes. Consider extracting it into a shared private helper (or into DistributionEstimatorBase) to avoid divergence/bugs when one implementation is updated and the other isn’t.

TomeHirata · 2026-05-04T05:59:32Z

            conditional_prediction[:, 1:] - conditional_prediction[:, :-1],
        )

+    def predict_qte(


Thank you for the addition! But it seems this is identical to BaseEstimator.predict_qte. Does this mean #64 is already fixed by the previous refactor and DTE for CAR is working fine as it is?

Thanks for the careful read! You're right that the structure looks almost identical, but there's one material difference — the bootstrap resampling step.

BaseEstimator.predict_qte uses a plain bootstrap: np.random.choice(indexes, size=n_obs, replace=True) over the entire sample.

This PR resamples within each stratum independently (per-stratum np.random.choice(idx, size=len(idx), replace=True), then concatenates).
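The two resampling steps can be put side by side on a small imbalanced example. This is a sketch for illustration, not code from the PR: the synthetic strata and the function names are made up, and it uses a seeded `np.random.default_rng` instead of the global `np.random.choice` so the run is reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, imbalanced strata: 90 observations in stratum 0, 10 in stratum 1.
strata = np.array([0] * 90 + [1] * 10)
n_obs = len(strata)
indexes = np.arange(n_obs)
strata_indices = {s: np.where(strata == s)[0] for s in np.unique(strata)}


def plain_bootstrap(rng):
    # Resample over the entire sample; stratum sizes can drift between replicates.
    return rng.choice(indexes, size=n_obs, replace=True)


def stratified_bootstrap(rng):
    # Resample within each stratum independently, then concatenate.
    return np.concatenate([
        rng.choice(idx, size=len(idx), replace=True)
        for idx in strata_indices.values()
    ])


# Size of the small stratum in each of 200 replicates.
plain_counts = [int(np.sum(strata[plain_bootstrap(rng)] == 1)) for _ in range(200)]
stratified_counts = [int(np.sum(strata[stratified_bootstrap(rng)] == 1)) for _ in range(200)]
```

Under the stratified scheme the small stratum has exactly 10 observations in every replicate, while under the plain bootstrap its size varies from replicate to replicate.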

That said, you're absolutely right that everything outside the bootstrap step is copy-pasted from the base, which is misleading. I'll address that with Copilot's #5 suggestion (extract the bootstrap loop into a shared helper, override only the resampling step) so the actual delta vs. base is obvious in the diff.

Makes sense, thanks for the explanation!

TomeHirata

LGTM once Copilot comments are addressed!

predict_qte signatures documented [0.1, ..., 0.9] as the default but passed None straight to _compute_qtes, which calls .shape on it and raises AttributeError. Materialize the default and validate that all quantiles lie in (0, 1) at the top of every predict_qte entry point.

Move the bootstrap loop into the base class and switch it to stratified resampling (per-stratum np.random.choice). Stratified resampling on a single stratum is equivalent to plain bootstrap, so SimpleDistributionEstimator and AdjustedDistributionEstimator (which set strata to a constant) remain unchanged in behavior while the CAR-aware variants pick up the correct variance estimator without any override. This removes the duplicated predict_qte bodies in both stratified subclasses, leaving the only delta vs. the base implementation in the resampling step.
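The design described in this commit message can be sketched roughly as follows. `ToyBaseEstimator` and `_bootstrap_indexes` are hypothetical stand-ins, not the library's actual classes or methods; the point is only that one resampling routine serves both cases, because a constant strata vector makes the per-stratum draw a draw over the whole sample.

```python
import numpy as np


class ToyBaseEstimator:
    """Hypothetical stand-in for the base class; not the library's real code."""

    def fit(self, outcomes, strata=None):
        self.outcomes = np.asarray(outcomes, dtype=float)
        if strata is None:
            # Non-stratified estimators: every observation sits in one stratum.
            strata = np.zeros(len(self.outcomes), dtype=int)
        self.strata = np.asarray(strata)
        return self

    def _bootstrap_indexes(self, rng):
        # Stratified resampling; with a single stratum this draws
        # len(outcomes) indices from the whole sample, i.e. a plain bootstrap.
        groups = [np.where(self.strata == s)[0] for s in np.unique(self.strata)]
        return np.concatenate([
            rng.choice(idx, size=len(idx), replace=True) for idx in groups
        ])


rng = np.random.default_rng(0)
plain = ToyBaseEstimator().fit(np.arange(6.0))
car = ToyBaseEstimator().fit(np.arange(6.0), strata=[0, 0, 0, 0, 1, 1])

plain_idx = plain._bootstrap_indexes(rng)  # six draws from indices 0..5
car_idx = car._bootstrap_indexes(rng)      # four draws from 0..3, then two from 4..5
```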

- test_predict_qte_preserves_per_stratum_counts: spy on _compute_qtes and assert every bootstrap replicate has the same per-stratum counts as the original sample. This would fail under a plain bootstrap.
- test_predict_qte_default_quantiles: predict_qte without quantiles returns shape (9,) for the [0.1, ..., 0.9] default.
- test_predict_qte_rejects_out_of_range_quantiles: values at the 0 / 1 boundary raise ValueError.
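For readers unfamiliar with the spy technique, the first test might look roughly like the sketch below. It is not the test from the PR: `ToyStratifiedEstimator` is a hypothetical stand-in whose `_compute_qtes` simply returns sample quantiles, and only the test name is taken from the commit message. The spy is built with `unittest.mock.patch.object(..., wraps=...)`, which records every call while still running the real method.

```python
from unittest import mock

import numpy as np


class ToyStratifiedEstimator:
    """Hypothetical stand-in; _compute_qtes here is NOT a real QTE computation."""

    def __init__(self, outcomes, strata):
        self.outcomes = np.asarray(outcomes, dtype=float)
        self.strata = np.asarray(strata)

    def _compute_qtes(self, quantiles, outcomes, strata):
        return np.quantile(outcomes, quantiles)

    def predict_qte(self, quantiles, n_bootstrap=20, seed=0):
        rng = np.random.default_rng(seed)
        point = self._compute_qtes(quantiles, self.outcomes, self.strata)
        groups = [np.where(self.strata == s)[0] for s in np.unique(self.strata)]
        for _ in range(n_bootstrap):
            # Stratified bootstrap: resample within each stratum independently.
            idx = np.concatenate([
                rng.choice(g, size=len(g), replace=True) for g in groups
            ])
            self._compute_qtes(quantiles, self.outcomes[idx], self.strata[idx])
        return point


def per_stratum_counts(strata):
    values, counts = np.unique(strata, return_counts=True)
    return dict(zip(values.tolist(), counts.tolist()))


def test_predict_qte_preserves_per_stratum_counts():
    est = ToyStratifiedEstimator(np.arange(10.0), [0] * 7 + [1] * 3)
    expected = per_stratum_counts(est.strata)
    # wraps= keeps the real behaviour while recording each call's arguments.
    with mock.patch.object(est, "_compute_qtes", wraps=est._compute_qtes) as spy:
        est.predict_qte(np.array([0.5]), n_bootstrap=20)
    assert spy.call_count == 21  # one point estimate plus 20 replicates
    for call in spy.call_args_list:
        assert per_stratum_counts(call.args[2]) == expected
```

If the resampling step were swapped for a plain bootstrap over the whole sample, the per-stratum counts would differ in at least some replicates and the final assertion would be expected to fail.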

TomeHirata

LGTM

add QTE support for covariate adaptive randomization

4bd2594


okiner-3 self-assigned this Apr 28, 2026

TomeHirata requested a review from Copilot April 29, 2026 07:50

Copilot started reviewing on behalf of TomeHirata April 29, 2026 07:51

Copilot AI reviewed Apr 29, 2026


TomeHirata reviewed May 4, 2026


TomeHirata previously approved these changes Jun 10, 2026


okiner-3 added 3 commits June 11, 2026 12:48

okiner-3 dismissed TomeHirata’s stale review via 59912fb June 11, 2026 03:55

okiner-3 requested a review from TomeHirata June 11, 2026 04:11

TomeHirata approved these changes Jun 11, 2026


TomeHirata merged commit 451d2f4 into main Jun 11, 2026
10 checks passed

TomeHirata deleted the feat/stratified-qte branch June 11, 2026 12:52

TomeHirata merged 4 commits into main from feat/stratified-qte


3 participants
